252 research outputs found

    Explaining AI: Are We Ready For It?

    Wrede B. Explaining AI: Are We Ready For It? Künstliche Intelligenz. 2020;34(1):1-3

    Modelling the effects of speech rate variation for automatic speech recognition

    Wrede B. Modelling the effects of speech rate variation for automatic speech recognition. Bielefeld (Germany): Bielefeld University; 2002. In automatic speech recognition it is a widely observed phenomenon that variations in speech rate cause severe degradations of the speech recognition performance. This is due to the fact that standard stochastic based speech recognition systems specialise on average speech rate. Although many approaches to modelling speech rate variation have been made, an integrated approach in a substantial system still has to be developed. General approaches to rate modelling are based on rate dependent models which are trained with rate specific subsets of the training data. During decoding a signal based rate estimation is performed according to which the set of rate dependent models is selected. While such approaches are able to reduce the word error rate significantly, they suffer from shortcomings such as the reduction of training data and the expensive training and decoding procedure. However, phonetic investigations show that there is a systematic relationship between speech rate and the acoustic characteristics of speech. In fast speech a tendency of reduction can be observed which can be described in more detail as a centralisation effect and an increase in coarticulation. Centralisation means that the formant frequencies of vowels tend to shift towards the vowel space center while increased coarticulation denotes the tendency of the spectral features of a vowel to shift towards those of its phonemic neighbour. The goal of this work is to investigate the possibility of incorporating the knowledge of the systematic nature of the influence of speech rate variation on the acoustic features in speech rate modelling. In an acoustic-phonetic analysis of a large corpus of spontaneous speech it was shown that an increased degree of the two effects of centralisation and coarticulation can be found in fast speech.
Several measures for these effects were developed and used in speech recognition experiments with rate dependent models. A thorough investigation of rate dependent models showed that with duration and coarticulation based measures significant increases of the performance could be achieved. It was shown that by the use of different measures the models were adapted either to centralisation or coarticulation. Further experiments showed that by a more detailed modelling with more rate classes a further improvement can be achieved. It was also observed that a general basis for the models is needed before rate adaptation can be performed. In a comparison to other sources of acoustic variation it was shown that the effects of speech rate are as severe as those of speaker variation and environmental noise. All these results show that for a more substantial system that models rate variations accurately it is necessary to focus on both durational and spectral effects. The systematic nature of the effects indicates that a continuous modelling is possible.

    Towards Multimodal Perception and Semantic Understanding in a Developmental Model of Speech Acquisition

    Philippsen A, Wrede B. Towards Multimodal Perception and Semantic Understanding in a Developmental Model of Speech Acquisition. Presented at the 2nd Workshop on Language Learning at Intern. Conf. on Development and Learning (ICDL-Epirob) 2017, Lisbon

    Goal Babbling of Acoustic-Articulatory Models with Adaptive Exploration Noise

    Philippsen A, Reinhart F, Wrede B. Goal Babbling of Acoustic-Articulatory Models with Adaptive Exploration Noise. Presented at the Sixth Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics (ICDL-EpiRob), Cergy-Pontoise / Paris, France

    Towards Tutoring an Interactive Robot

    Wrede B, Rohlfing K, Spexard TP, Fritsch J. Towards tutoring an interactive robot. In: Hackel M, ed. Humanoid Robots, Human-like Machines. ARS; 2007: 601-612. Many classical approaches developed so far for learning in a human-robot interaction setting have focussed on rather low level motor learning by imitation. Some doubts, however, have been cast on whether with this approach higher level functioning will be achieved. Higher level processes include, for example, the cognitive capability to assign meaning to actions in order to learn from the tutor. Such capabilities involve that an agent not only needs to be able to mimic the motoric movement of the action performed by the tutor. Rather, it understands the constraints, the means and the goal(s) of an action in the course of its learning process. Further support for this hypothesis comes from parent-infant instructions where it has been observed that parents are very sensitive and adaptive tutors who modify their behavior to the cognitive needs of their infant. Based on these insights, we have started our research agenda on analyzing and modeling learning in a communicative situation by analyzing parent-infant instruction scenarios with automatic methods. Results confirm the well-known observation that parents modify their behavior when interacting with their infant. We assume that these modifications do not only serve to keep the infant’s attention but do indeed help the infant to understand the actual goal of an action including relevant information such as constraints and means by enabling it to structure the action into smaller, meaningful chunks. We were able to determine first objective measurements from video as well as audio streams that can serve as cues for this information in order to facilitate learning of actions.

    Efficient Bootstrapping of Vocalization Skills Using Active Goal Babbling

    Philippsen A, Reinhart F, Wrede B. Efficient Bootstrapping of Vocalization Skills Using Active Goal Babbling. Presented at the International Workshop on Speech Robotics at Interspeech 2015, Dresden, Germany. We use goal babbling, a recent approach to bootstrapping inverse models, for vowel acquisition. In contrast to motor babbling, goal babbling organizes exploration in a low-dimensional goal space. While such a goal space is naturally given in many motor learning tasks, the difficulty in modeling speech production lies within the complexity of acoustic features. Often, the first and second formants are used as low-dimensional features. However, formants cannot capture richer characteristics of acoustic signals. We propose to use high-dimensional acoustic features based on a cochlea model and apply dimension reduction in order to generate a low-dimensional goal space. Instead of pre-defining targets in this goal space, we estimate a target distribution from ambient speech with a Gaussian Mixture Model. We demonstrate that goal babbling can be successfully applied in this goal space in order to learn a parametric model of vowel production specialized to a set of ambient speech sounds. By augmenting the goal-directed exploration along linear paths with an active selection of targets, we achieve a significant speed-up in learning.

    Exploring self-interruptions as a strategy for regaining the attention of distracted users

    Carlmeyer B, Schlangen D, Wrede B. Exploring self-interruptions as a strategy for regaining the attention of distracted users. In: Proceedings of the 1st Workshop on Embodied Interaction with Smart Environments - EISE '16. New York, NY: Association for Computing Machinery (ACM); 2016: 1

    Pragmatic Frames for Teaching and Learning in Human-Robot interaction: Review and Challenges

    Vollmer A-L, Wrede B, Rohlfing KJ, Oudeyer P-Y. Pragmatic Frames for Teaching and Learning in Human-Robot interaction: Review and Challenges. FRONTIERS IN NEUROROBOTICS. 2016;10: 10. One of the big challenges in robotics today is to learn from human users who are inexperienced in interacting with robots yet are often used to teaching skills flexibly to other humans and to children in particular. A potential route toward natural and efficient learning and teaching in Human-Robot Interaction (HRI) is to leverage the social competences of humans and the underlying interactional mechanisms. In this perspective, this article discusses the importance of pragmatic frames as flexible interaction protocols that provide important contextual cues to enable learners to infer new action or language skills and teachers to convey these cues. After defining and discussing the concept of pragmatic frames, grounded in decades of research in developmental psychology, we study a selection of HRI work in the literature which has focused on learning-teaching interaction and analyze the interactional and learning mechanisms that were used in the light of pragmatic frames. This allows us to show that many of the works have already used in practice, but not always explicitly, basic elements of the pragmatic frames machinery. However, we also show that pragmatic frames have so far been used in a very restricted way as compared to how they are used in human-human interaction and argue that this has been an obstacle preventing robust natural multi-task learning and teaching in HRI. In particular, we explain that two central features of human pragmatic frames, mostly absent from existing HRI studies, are that (1) social peers use rich repertoires of frames, potentially combined together, to convey and infer multiple kinds of cues; (2) new frames can be learnt continually, building on existing ones, and guiding the interaction toward higher levels of complexity and expressivity.
To conclude, we give an outlook on the future research direction describing the relevant key challenges that need to be solved for leveraging pragmatic frames for robot learning and teaching

    Towards Closed Feedback Loops in HRI

    Carlmeyer B, Schlangen D, Wrede B. Towards Closed Feedback Loops in HRI. In: Proceedings of the 2014 Workshop on Multimodal, Multi-Party, Real-World Human-Robot Interaction - MMRWHRI '14. New York, NY: Association for Computing Machinery (ACM); 2014: 1